18  Machine-level Data Types

When working in machine learning for graduate economics—or any computationally intensive field—understanding how data is represented at the hardware level is vital for ensuring numerical accuracy, preventing bugs, and making informed decisions about performance. High-level abstractions often mask the underlying details of how computers store integers, handle floating-point arithmetic, encode text, and interpret instructions. Yet these low-level details can profoundly affect the outcomes of computations, especially when dealing with large-scale data, sensitive financial figures, or tightly optimized algorithms. This appendix offers a thorough overview of the main categories of machine-level data types: unsigned integers, signed integers, floating-point numbers, text, and executable code.

The following sections include practical examples using NumPy, Python’s fundamental package for numerical computing. If you haven’t already, install NumPy with:

pip install numpy

18.1 Unsigned Integers

18.1.1 Binary Representation and Fixed-width Storage

At the heart of computing lies the binary number system, which uses just two digits, 0 and 1. Despite the simplicity of having only these two symbols, a binary representation can encode any nonnegative integer if enough bits are provided. Languages such as Python allow integers to grow arbitrarily large, but physical hardware operates with fixed-width registers and instructions that handle a specific number of bits at once—commonly 8, 16, 32, or 64 bits.

For example, the binary number 1101 is converted to decimal as follows:

1101 (binary) = (1 × 2³) + (1 × 2²) + (0 × 2¹) + (1 × 2⁰) = (1 × 8) + (1 × 4) + (0 × 2) + (1 × 1) = 8 + 4 + 0 + 1 = 13 (decimal)

Each position represents a power of 2, starting from 2⁰ (1) on the right and increasing as we move left. By multiplying each binary digit by its corresponding power of 2 and summing the results, we convert from binary to decimal.

When an integer is said to be “unsigned,” it means all possible bit patterns are interpreted as nonnegative values. If you have \(n\) bits, there are \(2^n\) distinct patterns of 0s and 1s. Because the smallest value is 0, the largest representable number is \(2^n - 1\). For instance, with 8 bits (often called a “byte”) we have:

  • Smallest value (binary 00000000): 0
  • Largest value (binary 11111111): 255, which is \(2^8 - 1\)

Extending this principle:

  • A 16-bit unsigned integer ranges from 0 to \(2^{16} - 1 = 65{,}535\).
  • A 32-bit unsigned integer ranges from 0 to \(2^{32} - 1 = 4{,}294{,}967{,}295\).
  • A 64-bit unsigned integer ranges from 0 to \(2^{64} - 1 = 18{,}446{,}744{,}073{,}709{,}551{,}615\).

Such integers are routinely employed in contexts like indexing arrays, counting objects, or storing memory addresses—situations where negative numbers are not meaningful. In a 32-bit system, for example, memory addresses typically require 32 bits, thus each address is an unsigned integer from 0 to \(2^{32}-1\). On a 64-bit system, addresses use 64-bit unsigned values.
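These ranges can be checked directly with NumPy's iinfo helper; a quick sketch using the standard NumPy unsigned dtypes:

import numpy as np

# Report the representable range of each unsigned integer width
for dt in (np.uint8, np.uint16, np.uint32, np.uint64):
    info = np.iinfo(dt)
    print(f"{dt.__name__}: {info.min} to {info.max}")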

18.1.2 Overflow, Underflow, and the Clock Analogy

Because the bit width is fixed in hardware arithmetic, computations can “wrap around” if a result exceeds the representable range. This phenomenon is referred to as overflow (when adding or otherwise increasing causes the value to jump from the maximum back to something smaller, often zero) or underflow (when subtracting causes the value to jump from zero to a high value).

A helpful analogy is a clock—except that instead of having 12 or 24 discrete positions, an \(n\)-bit unsigned integer has \(2^n\) positions.

In an 8-bit system, imagine a clock with 256 positions labeled 0 through 255:

  • Overflow occurs if you start at 255 and add 1, wrapping around to 0. For instance, if a register holds 254 and you add 1, you get 255, which is still within range. If you add another 1, you wrap to 0.
  • Underflow is analogous to turning the clock backward. If the register holds 0 and you subtract 1, it wraps around to 255.

The following NumPy example demonstrates this wraparound behavior:
import numpy as np

# Create array of unsigned 8-bit integers (0 to 255)
arr = np.array([254, 255, 0, 1], dtype=np.uint8)
print(f"Initial array: {arr}")

# Demonstrate overflow
print(f"Adding 1: {arr + 1}")  # 255 wraps to 0
print(f"Adding 2: {arr + 2}")  # 255 wraps to 1

18.1.3 Hexadecimal notation

Hexadecimal (base-16) notation is a convenient way to represent binary numbers, as each hexadecimal digit represents exactly four binary digits (bits). The sixteen digits are 0-9 for the first ten values, then A-F for values 10-15. For example:

  • Binary 0000 = Hex 0
  • Binary 1001 = Hex 9
  • Binary 1010 = Hex A
  • Binary 1111 = Hex F

This makes it easy to convert between binary and hexadecimal:

  • Binary 1101 0011 = Hex D3 (since 1101=D and 0011=3)
  • Hex 2F = Binary 0010 1111

Hexadecimal numbers are often prefixed with “0x” in programming languages to distinguish them from decimal numbers. For instance:

  • 0xFF = 255 in decimal
  • 0x100 = 256 in decimal
  • 0xA5 = 165 in decimal

This notation is particularly useful when working with computer memory addresses or examining binary data, as it makes long strings of bits more manageable while maintaining a direct relationship to the underlying binary representation.

Here’s how to display integers in different formats using Python f-strings:

num = 42
# Print in decimal (base 10)
print(f"Decimal: {num}")
# Print in binary with 0b prefix
print(f"Binary: {num:#b}")  
# Print in hex with 0x prefix
print(f"Hex: {num:#x}")
# Print in binary padded to 8 bits
print(f"8-bit binary: {num:08b}")
Decimal: 42
Binary: 0b101010
Hex: 0x2a
8-bit binary: 00101010

The # modifier adds the 0b or 0x prefix. The 08b format specifier pads the binary representation with leading zeros to 8 bits.
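Going the other way, Python's built-in int (with an explicit base) and hex functions convert between hexadecimal strings and integers:

# Parse hexadecimal strings into integers
print(int("FF", 16))     # 255
print(int("0x100", 16))  # 256
# Convert an integer to its hexadecimal string
print(hex(165))          # 0xa5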

18.2 Signed Integers

18.2.1 The Two’s Complement Convention

Unlike unsigned integers, signed integers can represent both positive and negative values. Modern hardware invariably does this using a scheme called “two’s complement.” Although historical architectures employed other representations, such as one’s complement or signed-magnitude, they have fallen out of use. Two’s complement is essentially a systematic way to encode negative numbers by treating bit patterns above a certain threshold as representing negative values.

To grasp the logic, consider an 8-bit integer. If one interprets all 8-bit patterns as unsigned, they range from 0 to 255. Two’s complement reassigns the higher half of these patterns (128 through 255) to represent negative numbers, by subtracting 256 from the unsigned interpretation:

  • 00000000 (binary) = 0 (decimal)

  • 01111111 (binary) = 127 (decimal)
  • 10000000 (binary) = 128 in unsigned interpretation, but in two’s complement it is \(-128\).
  • 10000001 (binary) = 129 in unsigned, but \(-127\) in two’s complement.

  • 11111111 (binary) = 255 in unsigned, but \(-1\) in two’s complement.

Hence, an 8-bit two’s complement integer spans \(-128\) to \(+127\). Generalizing, an \(n\)-bit two’s complement integer has a range from \(-2^{n-1}\) to \(2^{n-1} - 1\). For example, a 16-bit signed integer covers \(-32768\) to \(32767\), a 32-bit signed integer covers approximately \(-2.1 \times 10^9\) to \(+2.1 \times 10^9\), and so on.

18.2.2 A clock analogy

The behavior of two’s complement can also be understood through clock arithmetic. Suppose you have a traditional clock showing 2:45 (in hours and minutes). We can label this time in two different ways: we might say it is 45 minutes after 2:00, or equivalently that it is 15 minutes before 3:00, giving us -15 minutes relative to 3:00. This negative offset simply reflects a different convention in labeling the same position on the clock.

In an 8-bit two’s complement system, once the value goes beyond 127, it “wraps around” into the negative region. The range 128–255 is reused to represent negative integers from \(-128\) up to \(-1\). As an example, 148 in unsigned arithmetic might appear simply as 148. But if one interprets that same bit pattern in two’s complement, it becomes \(148 - 256 = -108\). This mirrors the idea that 45 minutes can be -15 if you consider it as time until 3:00, simply by shifting your frame of reference.
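NumPy lets us make this reinterpretation explicit. Viewing the same bytes first as unsigned and then as signed 8-bit integers shows how patterns above 127 become negative; a small sketch, with the values chosen to match the discussion above:

import numpy as np

# The same bit patterns, interpreted two ways
bits = np.array([0, 127, 128, 148, 255], dtype=np.uint8)
print(f"As unsigned: {bits}")
print(f"As two's complement: {bits.view(np.int8)}")  # [0 127 -128 -108 -1]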

18.2.3 Overflow and Sign Changes

Because two’s complement is basically a circular numerical system on \(2^n\) points, arithmetic operations can yield results that “wrap around,” causing surprising sign changes. If you add two positive numbers and the result surpasses \(2^{n-1} - 1\) (the largest positive representable value in \(n\)-bit two’s complement), you loop back into the negative values. For example, in an 8-bit system (\(-128\) to \(127\)):

  • Adding 100 and 50 ideally yields 150, but 150 exceeds 127. The result wraps to \(150 - 256 = -106\).
  • Adding \(-100\) and \(-50\) ideally gives \(-150\). Since \(-150\) is less than \(-128\), it wraps around to \(-150 + 256 = 106\).

Even multiplication can produce unintuitive sign flips. For instance, multiplying 64 by 5 is 320 in normal arithmetic. In an 8-bit two’s complement system, 320 is far above 127, so it wraps around to \(320 - 256 = 64\). Despite both operands being positive, the final result in 8-bit arithmetic is 64, demonstrating how drastically overflow can alter outcomes.

import numpy as np

# Create array of signed 8-bit integers (-128 to 127)
arr = np.array([100, 50, -100, -50], dtype=np.int8)
print(f"Initial array: {arr}")

# Demonstrate overflow with addition
print(f"100 + 50 = {arr[0] + arr[1]}")  # Wraps to negative
print(f"-100 + -50 = {arr[2] + arr[3]}")  # Wraps to positive

# Demonstrate overflow with multiplication
x = np.array([64, 5], dtype=np.int8)
print(f"64 × 4 = {x[0] * 4}")  # Wraps around
print(f"64 × 5 = {x[0] * x[1]}")  # Wraps around

These behaviors underscore why safety checks are crucial. In fields like financial data processing or physical simulations, accidental overflow can invalidate entire computations. Programmers may mitigate these risks by using larger integer sizes (e.g., 64-bit instead of 32-bit), using arbitrary-precision arithmetic libraries, or explicitly detecting and handling overflows. Nonetheless, two’s complement remains the universal standard for signed integer representation because it is straightforward to implement in hardware and provides correct results for most practical calculations within its designated range.
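For instance, one simple mitigation is to promote to a wider type (or to Python's arbitrary-precision integers) before the operation; a minimal sketch:

import numpy as np

x = np.array([64, 5], dtype=np.int8)
print(x[0].astype(np.int64) * x[1])  # 320: the wider type avoids the wraparound
print(int(x[0]) * int(x[1]))         # 320: Python ints never overflow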

18.3 Floating point numbers

In everyday arithmetic, you might jot down a decimal like 3.14 on paper, knowing it represents a truncated form of the irrational number π (pi). Similarly, on a computer, we usually cannot store exact real numbers; instead, we rely on approximations dictated by how data is physically stored in memory. The most common way of representing fractional numbers on computers is as floating point numbers. But before looking at them, we’ll examine the simpler fixed-point representation and see why it is usually not the best choice. Although computers of course use binary numbers, we will use decimal (base-10) numbers to explain the conceptual issues, since the base does not make a significant difference to the ideas.

18.3.1 Fixed-Point Representation

In a fixed-point system, we allocate a certain number of digits (in decimal) or bits (in binary) to the fractional part of a number. For a concrete example, imagine you want two digits to the right of the decimal point in decimal notation, maybe because we are representing monetary quantities in rupees and paise.

One simple way to do it is to multiply all quantities by 100 and store them as integers.

  • The number 37.42 is stored as the integer 3742. We interpret 3742 as 37.42 because of the implicit rule that the last two digits belong after the decimal point.
  • Similarly, the number 0.05 becomes 5 internally, since we again assume the last two digits are after the decimal.
  • If the number has more digits after the decimal point, we round it first. So 1.324 would first be rounded to 1.32 and then stored as 132.

This storage approach is conceptually simple: you just treat every “fixed-point number” as a scaled integer. Only when displaying them do we remember to place a decimal point before the last two digits.
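A minimal sketch of this scaled-integer idea in Python (the helper names to_fixed and from_fixed are made up for illustration):

def to_fixed(x, places=2):
    # Store x as an integer number of hundredths
    return round(x * 10**places)

def from_fixed(n, places=2):
    # Reinsert the decimal point for display
    return n / 10**places

print(to_fixed(37.42))   # 3742
print(to_fixed(1.324))   # 132 (rounded to 1.32)
print(from_fixed(132))   # 1.32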

A hallmark of fixed-point is constant absolute precision (or resolution). In the above example with two decimal digits:

  • The smallest increment is 0.01. You can represent 37.41 and 37.42, but not 37.415.
  • Hence, the maximum rounding error for any number is ±0.005 in decimal.
  • Another way to see this is that the space between any two neighboring numbers is 0.01—this gap never changes, regardless of whether you’re around 1.00 or 9999.99.

If your range is small or you only need a fixed absolute precision, this is actually very convenient. However, as soon as you need to represent both extremely large and extremely small values with precision, the uniform step size becomes a bottleneck.

This limitation is usually described as “fixed-point has a fixed range.” Indeed, if you want to store bigger numbers, you must increase the integer storage—and that necessarily squeezes out bits (or digits) that could have been used for the fractional part, lowering the precision. If you want very high precision, you have fewer bits for the integer part, lowering the maximum representable value.

If constant absolute error is what you need, there is no way out of this dilemma except to allocate more bits to your representation. However, it turns out that in most numeric applications you don’t need constant absolute error. It is constant relative error that is more important.

18.3.2 Relative vs. Absolute Error

  • Absolute Error: The absolute value of the difference between the true value and the represented value. So representing 500 as 502 involves an absolute error of 2.
  • Relative Error: The ratio of the absolute error to the value’s magnitude. So representing 500 as 502 has a relative error of \(2/500\).

In many scientific or mathematical scenarios, relative precision is what matters. If you are calculating the budget of the Government of India, an error of a few lakh rupees is negligible. If you are calculating the budget of a professor’s household, such an error can be catastrophic!

How would you store numbers if you were willing to accept varying absolute errors as long as relative errors stay constant? If you were of a mathematical bent of mind, you might get the idea of taking logarithms and storing the logarithms as fixed-precision numbers, since an absolute error in a logarithm corresponds to a relative error in the underlying number. So an additive error of \(\epsilon\) in the stored logarithm would lead to a multiplicative, i.e. relative, error of \(10^\epsilon\) in the actual number if we were using base-10 logarithms. In return, the range of numbers you could represent with a fixed number of digits would be greatly increased. With only two decimal digits for the integral portion you would be able to represent numbers as large as \(10^{99}\).
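A tiny numerical sketch of this idea using Python's math module: store the base-10 logarithm rounded to two decimal places and recover the number by exponentiating.

import math

x = 1234.0
stored = round(math.log10(x), 2)   # 3.09: the logarithm kept to two decimal places
recovered = 10 ** stored           # about 1230.3
print(f"Recovered: {recovered:.1f}, relative error: {abs(recovered - x) / x:.4f}")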

However, this scheme has two problems:

  • Logarithms are expensive to compute
  • Multiplication and division would be easy (which is what logarithms were invented for), but additions and subtractions would require us to take exponentials, carry out the operations and take logarithms again. Since additions/subtractions are much more common than multiplications/divisions, this would not be a very good situation.

Floating point representation is a halfway house between fixed-point representation and logarithmic representation which addresses these issues.

18.3.3 Floating point representation

The floating point representation of a positive number \(f\) is constructed by finding two numbers \(m\) and \(a\) such that we have \[f=m \times 10^a\] with \[1 \le m < 10\] and \(a\) an integer. Simply put, \(a\) is the number of times \(f\) has to be divided or multiplied by 10 to bring it into the 1 to 10 range. So the representation of \(123\) would be \(1.23\times 10^2\) while that of \(0.0123\) would be \(1.23 \times 10^{-2}\).

The number \(m\) is called the mantissa or significand and \(a\) the exponent of \(f\). We assign a certain fixed number of digits to encode the mantissa and another certain fixed number of digits to encode the exponent.

The number of digits assigned to the mantissa controls the relative precision of our representation. Say we have only three decimal digits for the mantissa. Then if we are asked to represent \(1234\) we have to round it off to \(1230 = 1.23 \times 10^3\). We have an absolute error of \(4\) and a relative error of \(4/1234\). If instead we had to represent \(0.1234\) we would have to round it off to \(0.123 = 1.23 \times 10^{-1}\). In this case we have an absolute error of \(0.0004\) and a relative error of \(0.0004/0.1234\), which is exactly the same as our earlier relative error of \(4/1234\). So the absolute error changes with the scale of the number represented, but the relative error remains the same, depending only on the number of digits in the mantissa.

The number of digits assigned to the exponent determines the range of values we can represent. Working in base 10, if we assign two digits to the exponent and five to the mantissa, the largest number we can store is \(9.9999 \times 10^{99}\) and the smallest positive number we can store is \(1.0000 \times 10^{-99}\), which is a much greater range than would have been possible with a fixed-point representation using the same number of digits.

The range of a floating point format has two “holes” on the real number line: numbers in them cannot be well-approximated by a floating point number in that format. The first hole is for small numbers very close to zero, in the example above between \(0\) and \(1.0000 \times 10^{-99}\). The other hole is for very large numbers, greater than \(9.9999 \times 10^{99}\) in our example. When the result of a calculation falls in the first hole, we say we have underflow. When it falls in the second hole we say we have overflow.

If we had been willing to relax the constraint \(1 \le m < 10\) on the mantissa we could represent even smaller numbers, e.g. \(0.0001 \times 10^{-99}\). This has a cost though: if we relax the constraint, we no longer have fixed relative errors. But in many cases paying this accuracy cost is better than having small results rounded to zero. Small floating point numbers which violate the constraint on the mantissa are called denormal numbers, and allowing results to be represented as such numbers is called allowing for gradual underflow.
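Python's built-in floats are IEEE double-precision numbers, so these behaviors can be seen directly: results beyond the largest double overflow to infinity, very small results become denormal, and still smaller ones underflow to zero.

print(1e308 * 10)      # inf: overflow past the largest double (about 1.8e308)
print(1e-308 / 1e10)   # roughly 1e-318: a denormal, representable only with reduced precision
print(1e-308 / 1e100)  # 0.0: underflow all the way to zero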

Finally, if we store information about signs separately, we can represent negative as well as positive numbers.

18.3.4 Roundoff Error

With a fixed number of digits to represent the significand, situations arise when the result of an arithmetic operation cannot be represented exactly and has to be rounded off. For example, say we have four decimal digits for the significand and we calculate \(5.565 + 5.654\). The exact result is \(11.219\), but since we have only four digits for the significand it has to be rounded to \(1.122 \times 10^1\).

In statistical and machine learning work, where final results are the outcome of many arithmetic operations, roundoff errors like these can accumulate and significantly affect results. For example, just calculating the mean of a million numbers requires a million additions, and in unfortunate situations this can have a large impact on the result. Fitting a model to data using an optimization algorithm involves a much larger number of arithmetic operations and a much larger scope for accumulating roundoff errors.
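As a small illustration (an assumed set-up, not tied to any particular data set), adding 0.1 repeatedly in single precision drifts away from the true total faster than the same loop in double precision:

import math
import numpy as np

n = 100_000          # the exact sum of n copies of 0.1 is 10000
s32 = np.float32(0.0)
s64 = 0.0            # Python floats are IEEE doubles
for _ in range(n):
    s32 += np.float32(0.1)
    s64 += 0.1

print(f"float32 running sum: {s32}")
print(f"float64 running sum: {s64}")
print(f"compensated sum (math.fsum): {math.fsum([0.1] * n)}")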

18.3.5 Binary Floating-Point and the IEEE 754 Standard

Coming to the actual binary floating point formats used in computers today, nothing significant changes from our discussion above except that the normalization condition on the significand becomes \(1 \le m < 2\). This means that the first digit, the digit to the left of the binary point in the mantissa, has to be \(1\), and so there is no need to explicitly store it.

Also, to avoid dealing with negative exponents, a constant positive bias is added to the exponent before storing.

One bit is used to store information about the sign.

Standard formats for binary floating point numbers and arithmetic on them were specified in the IEEE 754 standard and its two main formats are available on most CPUs and GPUs today:

  1. Single Precision (32-bit)
    • 1 sign bit.
    • 8 exponent bits (base 2), biased by 127.
    • 23 significand bits (plus 1 hidden bit).
    • Approximately 7 decimal digits of precision.
    • Range is roughly \(\pm 10^{-38}\) to \(\pm 10^{38}\).
  2. Double Precision (64-bit)
    • 1 sign bit.
    • 11 exponent bits, biased by 1023.
    • 52 significand bits (plus the hidden bit).
    • Approximately 15–17 decimal digits of precision.
    • Range is roughly \(\pm 10^{-308}\) to \(\pm 10^{308}\).

Here’s a Python example demonstrating how floating point numbers are stored in binary:

import struct

def show_float_bits(f):
    # Pack float into 4 bytes (32 bits) using IEEE 754 format
    packed = struct.pack('!f', f)
    
    # Convert to binary string
    bits = bin(int.from_bytes(packed, 'big'))[2:].zfill(32)
    
    # Split into sign, exponent, and mantissa
    sign = bits[0]
    exponent = bits[1:9]
    mantissa = bits[9:]
    
    # Convert exponent bits to decimal and remove bias
    exp_decimal = int(exponent, 2)
    unbiased_exp = exp_decimal - 127
    
    # Convert mantissa bits to decimal fraction
    mantissa_decimal = 0
    for i, bit in enumerate(mantissa, 1):
        mantissa_decimal += int(bit) * (2 ** -i)
    # Add the implicit leading 1
    mantissa_decimal += 1
    
    print(f"Number: {f}")
    print(f"Sign bit: {sign} ({'negative' if sign=='1' else 'positive'})")
    print(f"Exponent bits: {exponent} (decimal: {exp_decimal}, unbiased: {unbiased_exp})")
    print(f"Mantissa bits: {mantissa}")
    print(f"Mantissa value: {mantissa_decimal:.10f}")
    print(f"Full binary: {bits}")
    print(f"Reconstruction: {'-' if sign=='1' else ''}{mantissa_decimal} × 2^{unbiased_exp}")

# Example with a positive number
print("Positive example:")
show_float_bits(3.14)
print("\nNegative example:")
show_float_bits(-0.15625)
Positive example:
Number: 3.14
Sign bit: 0 (positive)
Exponent bits: 10000000 (decimal: 128, unbiased: 1)
Mantissa bits: 10010001111010111000011
Mantissa value: 1.5700000525
Full binary: 01000000010010001111010111000011
Reconstruction: 1.5700000524520874 × 2^1

Negative example:
Number: -0.15625
Sign bit: 1 (negative)
Exponent bits: 01111100 (decimal: 124, unbiased: -3)
Mantissa bits: 01000000000000000000000
Mantissa value: 1.2500000000
Full binary: 10111110001000000000000000000000
Reconstruction: -1.25 × 2^-3

Note how -0.15625 has an exact representation because it’s a multiple of a negative power of 2 (it’s -5/32), while 3.14 can only be approximated in binary floating point.

18.3.6 Special Values and Encodings

IEEE 754 reserves certain exponent patterns to represent:

  • Infinity: If the exponent is all 1’s (binary 11111111 for single precision) and the fraction bits are all 0, this encodes \(\pm \infty\) (sign bit determines which). One use of these values is to return the result of an operation where overflow occurs.
  • NaN (Not a Number): If the exponent is all 1’s and the fraction bits are not all zero, we have NaN. NaN values are used as the result of applying functions to values outside their domain (for example square root of negative numbers). They are also used to indicate missing values.
  • Zero: Represented by all bits in exponent and fraction being zero, sign bit can be 0 or 1 (so +0 and -0 exist, interestingly).
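These special values show up naturally in NumPy once computations leave the representable range or the domain of a function (here np.errstate simply silences the warnings these operations would otherwise raise):

import numpy as np

with np.errstate(all="ignore"):
    print(np.float32(1e38) * np.float32(10))   # inf: overflow
    print(np.float32(1.0) / np.float32(0.0))   # inf: division by zero
    print(np.sqrt(np.float32(-1.0)))           # nan: square root outside its domain
    print(np.float32(-0.0), np.float32(0.0) == np.float32(-0.0))  # -0.0 True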

18.3.7 Choosing the floating point format for a task

For simple programs the best choice is to use double precision arithmetic. This lets us avoid round-off issues in most cases.

For data analysis and machine learning tasks with large data sets, where hardware resources are constrained, a careful consideration of trade-offs becomes necessary.

18.3.7.1 Memory Footprint

  • Single Precision: 4 bytes per number. Storing 1 million floats = roughly 4 MB.
  • Double Precision: 8 bytes per number. Storing 1 million doubles = roughly 8 MB.

In large-scale machine learning, such as training a deep neural network with billions of parameters, doubling memory usage can limit the model size or batch size you can hold in GPU/CPU memory.
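The difference is easy to verify with NumPy's nbytes attribute:

import numpy as np

a32 = np.ones(1_000_000, dtype=np.float32)
a64 = np.ones(1_000_000, dtype=np.float64)
print(a32.nbytes, "bytes")  # 4000000
print(a64.nbytes, "bytes")  # 8000000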

18.3.7.2 Bandwidth and Throughput

Data transfer is often a bottleneck in HPC (High-Performance Computing) and ML systems. Halving the size of each floating-point number effectively doubles how fast data can move through memory channels. Moreover, many GPUs can perform single-precision math operations at a higher throughput than double-precision operations.

18.3.7.3 Computational Speed

In addition to bandwidth benefits, single-precision arithmetic instructions can execute faster on many architectures—particularly consumer or gaming GPUs repurposed for deep learning. Double precision, while more accurate, usually has a performance penalty in terms of floating-point operations per second (FLOPS).

18.3.7.4 Accuracy Requirements

  • 7 decimal digits (single precision) might be enough for many ML tasks (image classification, speech recognition, etc.), especially because these algorithms tolerate small numerical “noise.”
  • 15–17 decimal digits (double precision) is standard for more precise fields like financial calculations (to avoid major rounding errors in large sums), or certain scientific computations where error can accumulate over millions of steps.

In fact, for many machine learning situations, even less precision than single precision has been found to be usable. IEEE 754 specifies a 16-bit half-precision floating point type. Another 16-bit type widely supported in hardware is the bfloat16 type. Compared to IEEE half precision, bfloat16 has less precision but a larger range, and tends to be more useful for machine learning tasks.
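NumPy's finfo summarizes the precision–range trade-off for the IEEE formats it supports (bfloat16 is not part of standard NumPy, so it is omitted here):

import numpy as np

# eps is the gap between 1.0 and the next representable number:
# a direct measure of relative precision
for dt in (np.float16, np.float32, np.float64):
    info = np.finfo(dt)
    print(f"{dt.__name__}: {info.bits} bits, eps = {info.eps}, max = {info.max}")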

18.4 Text

18.4.1 Unicode and Code Points

At its core, computers can only store numbers, so text must be represented by mapping characters to numbers. This mapping is called a character encoding. When you type a letter ‘A’ on your keyboard, it needs to be converted to a number before it can be stored or processed by a computer. Similarly, when displaying text on screen, the computer must convert numbers back into visual symbols.

The first widely adopted standard for this mapping was ASCII (American Standard Code for Information Interchange). ASCII used 7 bits to represent 128 different characters:

  • Letters A-Z (both uppercase and lowercase)
  • Digits 0-9
  • Common punctuation marks (!@#$% etc.)
  • Control characters like newline (\n), tab (\t), and carriage return (\r)

However, 128 characters proved far too limited for global use. Many languages use characters beyond the basic Latin alphabet—accented letters, Chinese characters, Arabic script, and more. Thus, modern systems rely on Unicode, which assigns a unique number (code point) to characters from nearly all of humanity’s writing systems, as well as various symbols, emojis, and even ancient scripts like Egyptian hieroglyphs.

Some examples of Unicode code points include:

  • ‘A’: U+0041 (decimal 65)
  • ‘世’: U+4E16 (decimal 19990)
  • ‘🙂’: U+1F642 (decimal 128578)

The “U+” notation is the standard way to write Unicode code points, where “U+” is followed by a hexadecimal number. The “U” stands for Unicode, and the “+” indicates that what follows is a hexadecimal value. For example, “U+0041” represents the letter ‘A’, where “0041” is the hexadecimal representation of the decimal number 65.
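In Python, the built-in ord and chr functions move between characters and their code points:

print(ord("A"), hex(ord("A")))  # 65 0x41
print(chr(0x4E16))              # 世
print(chr(0x1F642))             # 🙂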

The Unicode standard now encompasses over 140,000 characters. While it specifies code points, there are multiple ways to encode these code points into bytes for storage or transmission. The most common encoding is UTF-8, which is a variable-length scheme:

  • Characters that fall within the standard ASCII range (0–127) use 1 byte.
  • Many European and Middle Eastern characters require 2 bytes.
  • Most East Asian characters typically require 3 bytes.
  • Less common symbols and many emojis occupy 4 bytes.

This design allows backward compatibility with older ASCII-based systems while still accommodating the global variety of characters in existence.
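Encoding a few characters to UTF-8 in Python shows these variable byte lengths directly (the characters here are just illustrative picks from each category):

for ch in ["A", "é", "世", "🙂"]:
    encoded = ch.encode("utf-8")
    print(f"{ch}: {len(encoded)} byte(s), {encoded.hex(' ')}")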

Unicode handles complex scripts like Devanagari through a sophisticated system of combining characters and rules for their arrangement. For instance, in Devanagari, the vowel sign ‘ि’ (i) is stored after the consonant it appears with, even though it’s displayed before it when rendered. Unicode defines not just the code points but also the rules for how these characters combine and interact. This includes handling of ligatures (like क्ष), half-forms of consonants, and the positioning of various marks relative to base characters.

Here’s a Python example showing how the Hindi word क्षणिक (“momentary”) is represented as Unicode code points:

import unicodedata
word = "क्षणिक"
for char in word:
    print(f"Character: {char}")
    print(f"Unicode name: {unicodedata.name(char)}")
    print(f"Code point: U+{ord(char):04X}")
    print()
Character: क
Unicode name: DEVANAGARI LETTER KA
Code point: U+0915

Character: ्
Unicode name: DEVANAGARI SIGN VIRAMA
Code point: U+094D

Character: ष
Unicode name: DEVANAGARI LETTER SSA
Code point: U+0937

Character: ण
Unicode name: DEVANAGARI LETTER NNA
Code point: U+0923

Character: ि
Unicode name: DEVANAGARI VOWEL SIGN I
Code point: U+093F

Character: क
Unicode name: DEVANAGARI LETTER KA
Code point: U+0915

Note how the word is composed of individual Unicode characters for consonants (क, ष, ण, क), the virama (्) that creates conjunct forms, and the vowel sign (ि). While we see क्ष rendered as a single glyph, it’s actually stored as three separate Unicode code points (क + ् + ष).

The actual rendering of these complex scripts relies heavily on advanced font technologies, particularly OpenType. OpenType fonts contain lookup tables and substitution rules that determine how characters should be displayed in different contexts. For example, when rendering Hindi text, the font’s logic determines when to create conjunct forms, how to position vowel marks relative to consonants, and when to use alternate glyph forms. These rules are script-specific—the same OpenType features that handle Devanagari conjuncts might handle Arabic ligatures or Thai vowel marks differently.

Modern text rendering engines like Harfbuzz use both Unicode’s character properties and OpenType font features to correctly display text. They first analyze the Unicode text to identify character clusters and their properties, then apply the appropriate OpenType features from the font to generate the final visual form. This separation of concerns—Unicode for character encoding and OpenType for visual rendering—allows for both accurate text processing and beautiful typography across the world’s writing systems.

18.5 Code

18.5.1 Machine Instructions and Assembly

At the lowest level, computers execute machine instructions—simple operations encoded as binary numbers. Each instruction contains an operation code (opcode) that tells the CPU what to do, and operands that specify what to do it with. These operands might be registers (small, fast storage locations in the CPU), memory addresses, or immediate values (constants).

Different CPU families use different instruction sets—the fundamental set of operations they can perform. The three most important instruction set architectures (ISAs) today are:

  • x86: Developed by Intel in 1978 for the 8086 processor, and used in processors by Intel and AMD, this architecture dominates desktop and server computing.

  • ARM: Originally developed by Acorn Computers in 1985, ARM (Advanced RISC Machine) was initially designed for low-power mobile devices and now powers most smartphones and tablets. In 2020, Apple began transitioning their Mac computers from Intel x86 to their own ARM-based processors, citing better performance per watt.

  • RISC-V: A newer, open-source ISA developed at UC Berkeley in 2010. Unlike x86 and ARM which require licensing fees, RISC-V is free to use. Its clean, modern design has gained significant traction in embedded systems, and companies like Western Digital and NVIDIA are developing RISC-V processors.

Assembly language provides a human-readable representation of machine instructions. Instead of writing binary numbers, programmers can use mnemonics like add or load along with register names and numbers. Here’s a RISC-V assembly program that calculates the sum of numbers from 1 to 100:

    li x5, 100  
    li x6, 0       
    li x7, 0     
loop:   
    addi x7, x7, 1   
    add x6, x6, x7   
    bne x7, x5, loop 

Here’s what the program means:

# Initialize register x5 with our target number (n=100)
# li means load immediate, that is load a constant
#     value (aka immediate) into a register.
# Instructions also exist for loading from and storing to memory.
    li x5, 100  

# Initialize register x6 to 0. It will hold the running sum  
    li x6, 0       

# Initialize register x7 to 0. It will hold the current number
#     being added
    li x7, 0     

# Label 'loop' marks where we'll jump back to
# Labels are not instructions, they are markers for
#     the assembler program.
loop:   

# x7 = x7 + 1
# The i in addi stands for immediate, i.e. constant             
    addi x7, x7, 1   

# x6 = x6 + x7
    add x6, x6, x7   

# Branch if Not Equal: if x7 ≠ x5, jump back to 'loop'
    bne x7, x5, loop 

# When x7 equals 100, we reach here
# Now x6 has the sum which can be used by the 
#   rest of the program

Each assembly instruction typically translates to exactly one machine instruction. The assembler converts these mnemonics into their binary equivalents that the CPU can execute directly.

The CPU reads these binary instructions one by one, decodes what operation to perform and what data to use, executes the operation, and moves to the next instruction. This process forms the foundation of all computer programs, though most programmers work at much higher levels of abstraction.

18.5.2 High-level Languages and Compilation

It is unwieldy and time-consuming for human programmers to write applications directly in machine code or even assembly. High-level languages like C, C++, Rust, Fortran, and many more are designed to be more expressive and human-readable. A compiler translates this high-level source code into machine instructions appropriate for a particular hardware architecture.

18.5.3 Interpreted Languages and Bytecode

Other popular languages, such as Python, Ruby, and JavaScript, use an interpreter that executes the code directly at runtime. Unlike compilation where code is translated to machine instructions ahead of time, interpretation processes and executes the code on the fly. This approach offers greater flexibility and portability since the same code can run on any platform with an appropriate interpreter, but it typically runs more slowly than compiled code. Some interpreters use just-in-time (JIT) compilation to improve performance by translating frequently executed code portions into machine instructions during runtime.
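For instance, CPython (the standard Python interpreter) first compiles source code into bytecode for an internal virtual machine and then interprets that bytecode; you can inspect it with the standard dis module. The function below is just a throwaway example.

import dis

def double_plus_one(x):
    return x * 2 + 1

# Show the bytecode instructions the interpreter will execute
dis.dis(double_plus_one)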

18.5.4 Trade-offs Between Compiled and Interpreted Approaches

Both compilation and interpretation have their places:

  • Compiled Languages
    • Typically faster at runtime because the translation overhead is done ahead of execution.
    • Less flexible in certain respects, with fewer run-time checks or dynamic features.
    • Produces architecture-specific executables that may need recompilation to run on different CPUs.
  • Interpreted Languages
    • Often more flexible and portable, enabling quick development and easy debugging.
    • Inherently slower if purely interpreted because every instruction is repeatedly decoded at runtime.
    • Can leverage just-in-time compilation for improved performance on critical code paths.

In many contemporary data science and machine learning workflows, developers use interpreted languages such as Python for high-level program logic, while relying on compiled libraries for performance-intensive numeric tasks. This hybrid approach allows rapid prototyping of algorithms in Python while still harnessing the raw speed of optimized C++ or Fortran routines underneath libraries such as NumPy, SciPy, or specialized machine learning frameworks.

18.5.5 Code is data

You might wonder why we’re discussing code in a chapter about data types. The key insight is that code itself is just another form of data that must be stored in computer memory. When you write a program, whether in assembly or a high-level language, it ultimately gets converted into binary data that the computer can process. This binary representation of instructions is stored and manipulated just like any other data type.

This dual nature of code-as-data enables the creation of programs that transform other programs. Compilers and interpreters are themselves programs that read source code (stored as text data) and either produce machine code (in the case of compilers) or execute the instructions directly (in the case of interpreters). The code lives in the computer’s storage, whether disk or RAM, just like any other data. Thought of in this way, machine ISAs and high-level languages are just another data format.
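Python makes this concrete: the compile built-in turns source text into a code object whose bytecode is just a bytes value, i.e. ordinary data.

code = compile("1 + 2", "<string>", "eval")
print(code.co_code)   # the compiled bytecode, stored as ordinary bytes
print(eval(code))     # 3: executing that data as code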

18.6 Conclusion

Machine-level data types form the bedrock of all software operations, including those in advanced fields like machine learning for economics. Unsigned integers provide efficient representations of nonnegative quantities but can overflow without warning. Signed integers, implemented universally via two’s complement, allow for positive and negative values yet also exhibit subtle wrapping behaviors that can drastically alter outcomes if ranges are exceeded. Floating-point numbers approximate real values with a careful balance between range and precision, as dictated by the IEEE 754 standard. Text is stored using encodings like UTF-8, which extend older ASCII definitions to handle a far broader set of global characters and symbols. Finally, all higher-level software must, at the end of the chain, be translated into the binary instructions a processor can directly execute.

By mastering these low-level concepts, economists, data scientists, and other technical specialists can write more robust algorithms, interpret computational results with greater confidence, and anticipate how hardware-level details might affect both performance and correctness.